The EDA on the war will use two datasets.
Important: Each new record is accumulated data from previous days.
Important. Data will be updated daily
We will talk about the following:
Source of data:source
Acronyms
Import the libraries
import json, requests
import pandas as pd
import numpy as np
import seaborn as sns
import plotly.express as px
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
response1 = requests.get('https://raw.githubusercontent.com/PetroIvaniuk/2022-Ukraine-Russia-War-Dataset/main/data/russia_losses_equipment.json')
response1.status_code
200
dict1 = response1.json()
Dataframe for losses on equipments
df1 = pd.DataFrame(dict1)
df1.head()
| date | day | aircraft | helicopter | tank | APC | field artillery | MRL | military auto | fuel tank | drone | naval ship | anti-aircraft warfare | special equipment | mobile SRBM system | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-02-25 | 2 | 10 | 7 | 80 | 516 | 49 | 4 | 100 | 60 | 0 | 2 | 0 | NaN | NaN |
| 1 | 2022-02-26 | 3 | 27 | 26 | 146 | 706 | 49 | 4 | 130 | 60 | 2 | 2 | 0 | NaN | NaN |
| 2 | 2022-02-27 | 4 | 27 | 26 | 150 | 706 | 50 | 4 | 130 | 60 | 2 | 2 | 0 | NaN | NaN |
| 3 | 2022-02-28 | 5 | 29 | 29 | 150 | 816 | 74 | 21 | 291 | 60 | 3 | 2 | 5 | NaN | NaN |
| 4 | 2022-03-01 | 6 | 29 | 29 | 198 | 846 | 77 | 24 | 305 | 60 | 3 | 2 | 7 | NaN | NaN |
response2 = requests.get('https://raw.githubusercontent.com/PetroIvaniuk/2022-Ukraine-Russia-War-Dataset/main/data/russia_losses_personnel.json')
response2.status_code
200
dict2 = response2.json()
Dataframe for losses on personnel and POW
df2 = pd.DataFrame(dict2)
df2.head()
| date | day | personnel | personnel* | POW | |
|---|---|---|---|---|---|
| 0 | 2022-02-25 | 2 | 2800 | about | 0 |
| 1 | 2022-02-26 | 3 | 4300 | about | 0 |
| 2 | 2022-02-27 | 4 | 4500 | about | 0 |
| 3 | 2022-02-28 | 5 | 5300 | about | 0 |
| 4 | 2022-03-01 | 6 | 5710 | about | 200 |
Merge both the dataframes
raw_df = pd.merge(df1,df2)
raw_df.head()
| date | day | aircraft | helicopter | tank | APC | field artillery | MRL | military auto | fuel tank | drone | naval ship | anti-aircraft warfare | special equipment | mobile SRBM system | personnel | personnel* | POW | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-02-25 | 2 | 10 | 7 | 80 | 516 | 49 | 4 | 100 | 60 | 0 | 2 | 0 | NaN | NaN | 2800 | about | 0 |
| 1 | 2022-02-26 | 3 | 27 | 26 | 146 | 706 | 49 | 4 | 130 | 60 | 2 | 2 | 0 | NaN | NaN | 4300 | about | 0 |
| 2 | 2022-02-27 | 4 | 27 | 26 | 150 | 706 | 50 | 4 | 130 | 60 | 2 | 2 | 0 | NaN | NaN | 4500 | about | 0 |
| 3 | 2022-02-28 | 5 | 29 | 29 | 150 | 816 | 74 | 21 | 291 | 60 | 3 | 2 | 5 | NaN | NaN | 5300 | about | 0 |
| 4 | 2022-03-01 | 6 | 29 | 29 | 198 | 846 | 77 | 24 | 305 | 60 | 3 | 2 | 7 | NaN | NaN | 5710 | about | 200 |
#Drop the column personnel*
raw_df.drop(['personnel*'],axis = 1,inplace = True)
#Capitalize the first letter of every word for column names
raw_df.columns = [i.title() for i in raw_df.columns]
raw_df.columns
Index(['Date', 'Day', 'Aircraft', 'Helicopter', 'Tank', 'Apc',
'Field Artillery', 'Mrl', 'Military Auto', 'Fuel Tank', 'Drone',
'Naval Ship', 'Anti-Aircraft Warfare', 'Special Equipment',
'Mobile Srbm System', 'Personnel', 'Pow'],
dtype='object')
#Rename the columns
raw_df = raw_df.rename(columns = {'Field Artillery':'Field_Artillery','Military Auto':'Military_Auto','Fuel Tank':'Fuel_Tank',
'Naval Ship':'Naval_Ship','Anti-Aircraft Warfare':'Anti_Aircraft_Warfare','Special Equipment':'Special_Equipment',
'Mobile Srbm System':'Mobile_Srbm_System','Apc':'APC','Mrl':'MRL' })
#Calculate the total of equipments lost per day
raw_df['Equipment_Per_Day'] = raw_df[raw_df.columns[2:15]].sum(axis = 1)
raw_df.info()
<class 'pandas.core.frame.DataFrame'> Int64Index: 43 entries, 0 to 42 Data columns (total 18 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Date 43 non-null object 1 Day 43 non-null object 2 Aircraft 43 non-null int64 3 Helicopter 43 non-null int64 4 Tank 43 non-null int64 5 APC 43 non-null int64 6 Field_Artillery 43 non-null int64 7 MRL 43 non-null int64 8 Military_Auto 43 non-null int64 9 Fuel_Tank 43 non-null int64 10 Drone 43 non-null int64 11 Naval_Ship 43 non-null int64 12 Anti_Aircraft_Warfare 43 non-null int64 13 Special_Equipment 25 non-null float64 14 Mobile_Srbm_System 16 non-null float64 15 Personnel 43 non-null int64 16 Pow 43 non-null int64 17 Equipment_Per_Day 43 non-null float64 dtypes: float64(3), int64(13), object(2) memory usage: 6.4+ KB
#Convert date to datetime
raw_df['Date'] = pd.to_datetime(raw_df['Date'])
raw_df['Day'] = raw_df['Day'].astype(int)
#Sort the null values in descending order
raw_df.isna().sum().sort_values(ascending = False)
Mobile_Srbm_System 27 Special_Equipment 18 Date 0 Day 0 Pow 0 Personnel 0 Anti_Aircraft_Warfare 0 Naval_Ship 0 Drone 0 Fuel_Tank 0 Military_Auto 0 MRL 0 Field_Artillery 0 APC 0 Tank 0 Helicopter 0 Aircraft 0 Equipment_Per_Day 0 dtype: int64
raw_df['Special_Equipment'].unique()
array([nan, 10., 11., 12., 13., 15., 16., 18., 19., 21., 22., 24., 25.])
raw_df['Mobile_Srbm_System'].unique()
array([nan, 2., 4.])
#Calculate the percentage of null values in our columns
missing_percentages = raw_df.isna().sum().sort_values(ascending = False)/len(raw_df)*100
missing_percentages
Mobile_Srbm_System 62.790698 Special_Equipment 41.860465 Date 0.000000 Day 0.000000 Pow 0.000000 Personnel 0.000000 Anti_Aircraft_Warfare 0.000000 Naval_Ship 0.000000 Drone 0.000000 Fuel_Tank 0.000000 Military_Auto 0.000000 MRL 0.000000 Field_Artillery 0.000000 APC 0.000000 Tank 0.000000 Helicopter 0.000000 Aircraft 0.000000 Equipment_Per_Day 0.000000 dtype: float64
# Plot a horizontal bar graph of only null values
percentage_plot = missing_percentages[missing_percentages != 0 ].plot.bar(rot = 0,figsize = (10,10),color = 'lightblue')
plt.ylabel('Missing Values',size = 13)
plt.title(' Missing % of null values for equipments', size = 15)
plt.grid(axis = 'y',linestyle = '--',alpha = 0.8)
plt.bar_label(percentage_plot.containers[0],size = 10)
[Text(0, 0, '62.7907'), Text(0, 0, '41.8605')]
#Check for null rows
null_columns = raw_df[['Mobile_Srbm_System','Special_Equipment']][raw_df.isnull().any(axis = 1)]
We choose to replace null values with zero as data has not been provided for "NaN" values.
raw_df.fillna(0,inplace = True)
raw_df.isnull().any()
Date False Day False Aircraft False Helicopter False Tank False APC False Field_Artillery False MRL False Military_Auto False Fuel_Tank False Drone False Naval_Ship False Anti_Aircraft_Warfare False Special_Equipment False Mobile_Srbm_System False Personnel False Pow False Equipment_Per_Day False dtype: bool
#We have no duplicated values
raw_df.duplicated().sum()
0
#Total losses of equipment
col_list = list(raw_df.columns[2:15])
Total = []
for i in raw_df.columns[2:15]:
Total.append(raw_df[i].sum())
Equipment = pd.DataFrame({'Equipment': col_list,'Total': Total})
Equipment
| Equipment | Total | |
|---|---|---|
| 0 | Aircraft | 3927.0 |
| 1 | Helicopter | 4201.0 |
| 2 | Tank | 19575.0 |
| 3 | APC | 60021.0 |
| 4 | Field_Artillery | 9149.0 |
| 5 | MRL | 3090.0 |
| 6 | Military_Auto | 36258.0 |
| 7 | Fuel_Tank | 2866.0 |
| 8 | Drone | 1756.0 |
| 9 | Naval_Ship | 188.0 |
| 10 | Anti_Aircraft_Warfare | 1619.0 |
| 11 | Special_Equipment | 493.0 |
| 12 | Mobile_Srbm_System | 62.0 |
#Total losses in personnel from our raw dataset
print(f'Total personnel losses: {raw_df.Personnel.sum()}')
Total personnel losses: 589150
Note: The dataset conatins a combination of equipments lost for both entities.
Before carrying out any analysis let us cross check our data with real estimates of Russian firepower in comparison with columns in our dataset.
Personnel / Manpower:
Total Military Personnel(est.) : 1,350,000
Total Military Personnel can be broken down into three categories:
Airpower:
Total Aircraft strength:4,173
Land Forces:
Naval Forces:
NOTE: COMPARE THIS DATA WITH THE TOTAL LOSSES RECORDED IN OUR DATASET
Our losses of helicopters in is far more than the inventory of helicopters in Russia.
The Ukrainian Air Force has: 69 fighters 29 dedicated attack aircraft 32 transports 71 trainers 5 aircraft designated as special mission 112 helicopters source
Combined inventory of helicopters is still less than the losses we have recorded.
The same can be said about the personnel lost.
IN ORDER TO PROCEED our first step will be to create a revised dataframe that will contain the equipment and personnel lossesfor every 24hrs.
#Assign the list of dates and days to two different variables
date = list(raw_df['Date'])
day = list(raw_df['Day'])
# diff() will be applied to columns in order to analyze the looses every 24hrs
df = raw_df[['Aircraft', 'Helicopter', 'Tank', 'APC',
'Field_Artillery', 'MRL', 'Military_Auto', 'Fuel_Tank', 'Drone',
'Naval_Ship', 'Anti_Aircraft_Warfare', 'Special_Equipment',
'Mobile_Srbm_System', 'Personnel', 'Pow', 'Equipment_Per_Day']].diff()
df['Date'] = date
df['Day'] = day
# Create a dataframe that will contain the values for every 24hrs
df = df.reindex(columns = ['Date', 'Day', 'Aircraft', 'Helicopter', 'Tank', 'APC',
'Field_Artillery', 'MRL', 'Military_Auto', 'Fuel_Tank', 'Drone',
'Naval_Ship', 'Anti_Aircraft_Warfare', 'Special_Equipment',
'Mobile_Srbm_System', 'Personnel', 'Pow', 'Equipment_Per_Day'])
df.head()
| Date | Day | Aircraft | Helicopter | Tank | APC | Field_Artillery | MRL | Military_Auto | Fuel_Tank | Drone | Naval_Ship | Anti_Aircraft_Warfare | Special_Equipment | Mobile_Srbm_System | Personnel | Pow | Equipment_Per_Day | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-02-25 | 2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 2022-02-26 | 3 | 17.0 | 19.0 | 66.0 | 190.0 | 0.0 | 0.0 | 30.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1500.0 | 0.0 | 324.0 |
| 2 | 2022-02-27 | 4 | 0.0 | 0.0 | 4.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 200.0 | 0.0 | 5.0 |
| 3 | 2022-02-28 | 5 | 2.0 | 3.0 | 0.0 | 110.0 | 24.0 | 17.0 | 161.0 | 0.0 | 1.0 | 0.0 | 5.0 | 0.0 | 0.0 | 800.0 | 0.0 | 323.0 |
| 4 | 2022-03-01 | 6 | 0.0 | 0.0 | 48.0 | 30.0 | 3.0 | 3.0 | 14.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 410.0 | 200.0 | 100.0 |
#Save the values in an object
val = raw_df.iloc[0:1,2:18]
#Use the object to replace NaN
df.iloc[0:1,2:18] = val
df.describe()
| Day | Aircraft | Helicopter | Tank | APC | Field_Artillery | MRL | Military_Auto | Fuel_Tank | Drone | Naval_Ship | Anti_Aircraft_Warfare | Special_Equipment | Mobile_Srbm_System | Personnel | Pow | Equipment_Per_Day | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 | 43.000000 |
| mean | 24.418605 | 3.534884 | 3.186047 | 16.790698 | 44.441860 | 7.953488 | 2.511628 | 32.186047 | 1.767442 | 2.604651 | 0.162791 | 1.279070 | 0.581395 | 0.093023 | 448.837209 | 10.860465 | 117.093023 |
| std | 13.227375 | 3.984188 | 4.811968 | 16.834712 | 82.239575 | 10.823153 | 3.948309 | 40.733311 | 9.226869 | 3.506046 | 0.484534 | 1.816925 | 1.607017 | 0.426165 | 675.207937 | 32.499510 | 142.962002 |
| min | 2.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 5.000000 |
| 25% | 13.500000 | 1.000000 | 0.000000 | 7.500000 | 15.500000 | 2.000000 | 0.000000 | 13.500000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 100.000000 | 0.000000 | 60.500000 |
| 50% | 25.000000 | 2.000000 | 2.000000 | 13.000000 | 23.000000 | 5.000000 | 1.000000 | 22.000000 | 0.000000 | 2.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 200.000000 | 0.000000 | 88.000000 |
| 75% | 35.500000 | 4.500000 | 3.000000 | 16.500000 | 37.500000 | 10.000000 | 3.500000 | 34.500000 | 0.000000 | 3.000000 | 0.000000 | 2.000000 | 1.000000 | 0.000000 | 350.000000 | 9.000000 | 101.500000 |
| max | 46.000000 | 17.000000 | 20.000000 | 80.000000 | 516.000000 | 51.000000 | 17.000000 | 224.000000 | 60.000000 | 15.000000 | 2.000000 | 8.000000 | 10.000000 | 2.000000 | 3160.000000 | 200.000000 | 828.000000 |
def boxplot_plot(x):
for i in df.columns[2:]:
fig = plt.figure(figsize = (8,8))
x.boxplot(column = [i])
plt.grid(False)
plt.title(i+ ' outliers',fontstyle = 'italic')
plt.grid(axis = 'y',linestyle = '--',alpha = 0.5)
plt.tight_layout()
plt.show()
boxplot_plot(df)
Create a distplot for equipments, personnel and POW
def distplot_plot(x):
for i in df.columns[2:]:
fig = plt.figure(figsize = (15,5))
sns.distplot(x[i])
mean = x[i].mean()
median = x[i].median()
Q1 = x[i].quantile(0.25) #25th percentile
Q3 = x[i].quantile(0.75) #75th percentile
plt.axvline(mean,color = 'red',label = 'Mean')
plt.axvline(median,color = 'green',label = 'Median')
plt.axvline(Q1,color = 'yellow',label = '25th_ercentile')
plt.axvline(Q3,color = 'blue',label = '75th_percentile')
plt.legend()
plt.title('Distribution of ' + i,fontstyle = 'italic',fontsize = 12)
plt.grid(axis = 'y',linestyle = '--',alpha = 0.5)
plt.show()
distplot_plot(df)
We will avoid treating the outliers as any manipultaion of outliers will lead to a significant loss of data for different dates
In our case we need numerical values to accurately analyze the trend for different types of equipments, personnel and POW
#Total losses in personnel till the last date of our dataset
print(f'Total personnel losses after calculating the diff(): {df.Personnel.sum()}')
Total personnel losses after calculating the diff(): 19300.0
#Total losses in equipment till the last date of our dataset
col_list = list(df.columns[2:15])
Total = []
for i in df.columns[2:15]:
Total.append(df[i].sum())
Equipment_diff= pd.DataFrame({'Equipment_diff': col_list,'Total': Total})
Equipment_diff
| Equipment_diff | Total | |
|---|---|---|
| 0 | Aircraft | 152.0 |
| 1 | Helicopter | 137.0 |
| 2 | Tank | 722.0 |
| 3 | APC | 1911.0 |
| 4 | Field_Artillery | 342.0 |
| 5 | MRL | 108.0 |
| 6 | Military_Auto | 1384.0 |
| 7 | Fuel_Tank | 76.0 |
| 8 | Drone | 112.0 |
| 9 | Naval_Ship | 7.0 |
| 10 | Anti_Aircraft_Warfare | 55.0 |
| 11 | Special_Equipment | 25.0 |
| 12 | Mobile_Srbm_System | 4.0 |
#Create a line plot
fig = px.line(df,x = 'Date', y = 'Equipment_Per_Day',title = 'Trend in equipments lost')
fig.add_hline(y = df['Equipment_Per_Day'].max(),line_dash="dash", line_color="red",annotation_text="Max. val",opacity = 0.4)
fig.add_hline(y = df['Equipment_Per_Day'].min(),line_dash="dash", line_color="green",annotation_text="Min. val",opacity = 0.4)
#Update trace and layout
fig.update_traces(mode="markers+lines")
fig.update_layout(hovermode="x unified")
fig.show()
# Calculate the date for which the losses in equipments was the highest and the lowest
Max_val = df[['Date','Equipment_Per_Day']][df['Equipment_Per_Day'] == df['Equipment_Per_Day'].max()]
Min_val = df[['Date','Equipment_Per_Day']][df['Equipment_Per_Day'] == df['Equipment_Per_Day'].min()]
display(Max_val,Min_val)
| Date | Equipment_Per_Day | |
|---|---|---|
| 0 | 2022-02-25 | 828.0 |
| Date | Equipment_Per_Day | |
|---|---|---|
| 2 | 2022-02-27 | 5.0 |
So far the maximum amount of losses have been recorded on 2022-02-25 and the minimum on 2022-02-27. This will be subjected changes that will be dependent on data update.
Equipment with least losses
least_used_equipment = Equipment_diff[Equipment_diff['Total'] == Equipment_diff['Total'].min()]
least_used_equipment
| Equipment_diff | Total | |
|---|---|---|
| 12 | Mobile_Srbm_System | 4.0 |
Equipment with min. losses: Mobile Short Range Ballistic Missile (SRBM) System
Equipment with most losses
most_used_equipment = Equipment_diff[Equipment_diff['Total'] == Equipment_diff['Total'].max()]
most_used_equipment
| Equipment_diff | Total | |
|---|---|---|
| 3 | APC | 1911.0 |
Equipment with max. losses : APC
fig = px.bar(Equipment,x = 'Equipment',y = 'Total',log_y = True,title = 'Total of each equipment used',height = 700,
hover_data = ['Equipment','Total'],color = 'Equipment')
fig.show()
fig = px.bar(df,x = 'Date',y = ['Aircraft', 'Helicopter', 'Tank', 'APC',
'Field_Artillery', 'MRL', 'Military_Auto', 'Fuel_Tank', 'Drone',
'Naval_Ship', 'Anti_Aircraft_Warfare', 'Special_Equipment',
'Mobile_Srbm_System'],log_y = True,
height= 1000)
fig.update_layout(title = 'Equipments vs Dates',
xaxis_title = 'Date',
yaxis_title = 'Frequency of Equipments',
hovermode="x unified")
fig.show()
We create a function that comprises of three different operations:
def analysis(eq):
#Line plot
fig = px.line(df,x = 'Date',y = eq ,height = 450)
fig.update_layout(title = eq + ' % loss of over time',
xaxis_title = 'Date',
yaxis_title = 'Frequency',
hovermode="x unified")
fig.add_hline(y=df[eq].max(), line_dash="dot",
annotation_text="Max. Value",
annotation_position="bottom right",
line_color="green")
fig.add_hline(y=df[eq].min(), line_dash="dot",
annotation_text='Min. Value',
annotation_position="bottom right",
line_color="red")
fig.show()
print(f' Max.loss: {df[eq].max()}')
print(f' Min.loss: {df[eq].min()}')
#Create the dataframe
date = df.Date
losses = df[eq]
a = losses.diff()
df_new = pd.DataFrame({'Date': date,'Loss': losses,'Loss_percentage':a})
df_new['Loss_percentage'] = (df_new['Loss_percentage']/df_new['Loss'])*100
df_new['Loss_percentage'].replace(np.nan,0,inplace = True)
df_new['Loss_percentage'].replace(-np.inf,-100,inplace = True)
#Bar plot
fig = px.bar(df_new,x = 'Date',y = 'Loss_percentage',color = 'Loss',height = 600,title = '% losses over time')
fig.update_layout(hovermode="x unified")
fig.show()
#Return the dataframe
return df_new
AIRCRAFTS
Aircraft = analysis('Aircraft')
Max.loss: 17.0 Min.loss: 0.0
Aircraft.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 10.0 | 0.000000 |
| 1 | 2022-02-26 | 17.0 | 41.176471 |
| 2 | 2022-02-27 | 0.0 | -100.000000 |
| 3 | 2022-02-28 | 2.0 | 100.000000 |
| 4 | 2022-03-01 | 0.0 | -100.000000 |
HELICOPTER
Helicopter = analysis('Helicopter')
Max.loss: 20.0 Min.loss: 0.0
Helicopter.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 7.0 | 0.000000 |
| 1 | 2022-02-26 | 19.0 | 63.157895 |
| 2 | 2022-02-27 | 0.0 | -100.000000 |
| 3 | 2022-02-28 | 3.0 | 100.000000 |
| 4 | 2022-03-01 | 0.0 | -100.000000 |
TANK
Tank = analysis('Tank')
Max.loss: 80.0 Min.loss: 0.0
Tank.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 80.0 | 0.000000 |
| 1 | 2022-02-26 | 66.0 | -21.212121 |
| 2 | 2022-02-27 | 4.0 | -1550.000000 |
| 3 | 2022-02-28 | 0.0 | -100.000000 |
| 4 | 2022-03-01 | 48.0 | 100.000000 |
APC
APC = analysis('APC')
Max.loss: 516.0 Min.loss: 0.0
APC.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 516.0 | 0.000000 |
| 1 | 2022-02-26 | 190.0 | -171.578947 |
| 2 | 2022-02-27 | 0.0 | -100.000000 |
| 3 | 2022-02-28 | 110.0 | 100.000000 |
| 4 | 2022-03-01 | 30.0 | -266.666667 |
FIELD ARTILLERY
Field_Artillery= analysis('Field_Artillery')
Max.loss: 51.0 Min.loss: 0.0
Field_Artillery.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 49.0 | 0.000000 |
| 1 | 2022-02-26 | 0.0 | -100.000000 |
| 2 | 2022-02-27 | 1.0 | 100.000000 |
| 3 | 2022-02-28 | 24.0 | 95.833333 |
| 4 | 2022-03-01 | 3.0 | -700.000000 |
MRL
MRL = analysis('MRL')
Max.loss: 17.0 Min.loss: 0.0
MRL.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 4.0 | 0.000000 |
| 1 | 2022-02-26 | 0.0 | -100.000000 |
| 2 | 2022-02-27 | 0.0 | 0.000000 |
| 3 | 2022-02-28 | 17.0 | 100.000000 |
| 4 | 2022-03-01 | 3.0 | -466.666667 |
MILITARY AUTO
Military_Auto = analysis('Military_Auto')
Max.loss: 224.0 Min.loss: 0.0
Military_Auto.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 100.0 | 0.000000 |
| 1 | 2022-02-26 | 30.0 | -233.333333 |
| 2 | 2022-02-27 | 0.0 | -100.000000 |
| 3 | 2022-02-28 | 161.0 | 100.000000 |
| 4 | 2022-03-01 | 14.0 | -1050.000000 |
FUEL TANK
Fuel_Tank = analysis('Fuel_Tank')
Max.loss: 60.0 Min.loss: 0.0
Fuel_Tank.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 60.0 | 0.0 |
| 1 | 2022-02-26 | 0.0 | -100.0 |
| 2 | 2022-02-27 | 0.0 | 0.0 |
| 3 | 2022-02-28 | 0.0 | 0.0 |
| 4 | 2022-03-01 | 0.0 | 0.0 |
DRONES
Drones = analysis('Drone')
Max.loss: 15.0 Min.loss: 0.0
Drones.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 0.0 | 0.0 |
| 1 | 2022-02-26 | 2.0 | 100.0 |
| 2 | 2022-02-27 | 0.0 | -100.0 |
| 3 | 2022-02-28 | 1.0 | 100.0 |
| 4 | 2022-03-01 | 0.0 | -100.0 |
NAVAL SHIP
Naval_Ship = analysis('Naval_Ship')
Max.loss: 2.0 Min.loss: 0.0
Naval_Ship.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 2.0 | 0.0 |
| 1 | 2022-02-26 | 0.0 | -100.0 |
| 2 | 2022-02-27 | 0.0 | 0.0 |
| 3 | 2022-02-28 | 0.0 | 0.0 |
| 4 | 2022-03-01 | 0.0 | 0.0 |
ANTI AIRCRAFT WARFARE
Anti_Aircraft_Warfare = analysis('Anti_Aircraft_Warfare')
Max.loss: 8.0 Min.loss: 0.0
Anti_Aircraft_Warfare.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 0.0 | 0.0 |
| 1 | 2022-02-26 | 0.0 | 0.0 |
| 2 | 2022-02-27 | 0.0 | 0.0 |
| 3 | 2022-02-28 | 5.0 | 100.0 |
| 4 | 2022-03-01 | 2.0 | -150.0 |
SPECIAL EQUIPMENT
Special_Equipment = analysis('Special_Equipment')
Max.loss: 10.0 Min.loss: 0.0
Special_Equipment.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 0.0 | 0.0 |
| 1 | 2022-02-26 | 0.0 | 0.0 |
| 2 | 2022-02-27 | 0.0 | 0.0 |
| 3 | 2022-02-28 | 0.0 | 0.0 |
| 4 | 2022-03-01 | 0.0 | 0.0 |
MOBILE SRBM SYSTEM
Mobile_SRBM_System = analysis('Mobile_Srbm_System')
Max.loss: 2.0 Min.loss: 0.0
Mobile_SRBM_System.head()
| Date | Loss | Loss_percentage | |
|---|---|---|---|
| 0 | 2022-02-25 | 0.0 | 0.0 |
| 1 | 2022-02-26 | 0.0 | 0.0 |
| 2 | 2022-02-27 | 0.0 | 0.0 |
| 3 | 2022-02-28 | 0.0 | 0.0 |
| 4 | 2022-03-01 | 0.0 | 0.0 |
PERSONNEL
Personnel = analysis('Personnel')
Max.loss: 3160.0 Min.loss: 0.0
PRISONER OF WAR
POW = analysis('Pow')
Max.loss: 200.0 Min.loss: 0.0
Personnel_pow_plt = df[['Personnel','Pow']].sum().plot.bar(rot = 0,figsize = (10,10),cmap = 'plasma')
plt.ylabel('Sum',fontsize = 13)
plt.title('Personnel and POW loss',fontsize = 15)
plt.grid(axis = 'y',linestyle = '--',alpha = 0.8)
plt.bar_label(Personnel_pow_plt.containers[0],size = 10)
plt.show()
Create a database for the aerial equipments lost.
Date = df.Date
Aircraft = df.Aircraft
Helicopter = df.Helicopter
Drone = df.Drone
Anti_Aircraft_Warfare = df.Anti_Aircraft_Warfare
Mobile_Srbm_System = df.Mobile_Srbm_System
df_aerial = pd.DataFrame({'Date': Date,'Aircaft':Aircraft,'Helicopter':Helicopter,'Drone':Drone,
'Anti_Aircraft_Warfare':Anti_Aircraft_Warfare,'Mobile_Srbm_System':Mobile_Srbm_System})
df_aerial['Total_Losses_Aerial'] = df_aerial[df_aerial.columns[1:6]].sum(axis = 1)
df_aerial.head()
| Date | Aircaft | Helicopter | Drone | Anti_Aircraft_Warfare | Mobile_Srbm_System | Total_Losses_Aerial | |
|---|---|---|---|---|---|---|---|
| 0 | 2022-02-25 | 10.0 | 7.0 | 0.0 | 0.0 | 0.0 | 17.0 |
| 1 | 2022-02-26 | 17.0 | 19.0 | 2.0 | 0.0 | 0.0 | 38.0 |
| 2 | 2022-02-27 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 2022-02-28 | 2.0 | 3.0 | 1.0 | 5.0 | 0.0 | 11.0 |
| 4 | 2022-03-01 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 2.0 |
We will create a new dataframe from our dataframe of losses for aerial equipments.
This dataframe will help us see the contribution of different aerial equipments (percentage) towards the total of aerial equipments lost
col_list = list(df_aerial.columns[1:6])
Percentage = []
for i in df_aerial.columns[1:6]:
Percentage.append(round((df_aerial[i].sum()/df_aerial['Total_Losses_Aerial'].sum())*100,1))
df_aerial_percentage = pd.DataFrame({'Aerial Equipment': col_list, '%_of_sum_loss_aerial':Percentage})
df_aerial_percentage
| Aerial Equipment | %_of_sum_loss_aerial | |
|---|---|---|
| 0 | Aircaft | 33.0 |
| 1 | Helicopter | 29.8 |
| 2 | Drone | 24.3 |
| 3 | Anti_Aircraft_Warfare | 12.0 |
| 4 | Mobile_Srbm_System | 0.9 |
fig = px.pie(df_aerial_percentage, values='%_of_sum_loss_aerial',
names='Aerial Equipment', color_discrete_sequence=px.colors.sequential.Blues)
fig.show()
Create a database for the ground equipments lost.
Date = df.Date
Tank = df.Tank
APC = df.APC
Field_Artillery = df.Field_Artillery
MRL = df.MRL
Military_Auto = df.Military_Auto
Fuel_Tank = df.Fuel_Tank
df_ground = pd.DataFrame({'Date':Date,'Tank': Tank,'APC': APC,'Field_Artillery':Field_Artillery,
'MRL':MRL,'Military_Auto':Military_Auto,'Fuel_Tank':Fuel_Tank})
df_ground['Total_Losses_Ground'] =df_ground[df_ground.columns[1:7]].sum(axis = 1)
df_ground.head()
| Date | Tank | APC | Field_Artillery | MRL | Military_Auto | Fuel_Tank | Total_Losses_Ground | |
|---|---|---|---|---|---|---|---|---|
| 0 | 2022-02-25 | 80.0 | 516.0 | 49.0 | 4.0 | 100.0 | 60.0 | 809.0 |
| 1 | 2022-02-26 | 66.0 | 190.0 | 0.0 | 0.0 | 30.0 | 0.0 | 286.0 |
| 2 | 2022-02-27 | 4.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 5.0 |
| 3 | 2022-02-28 | 0.0 | 110.0 | 24.0 | 17.0 | 161.0 | 0.0 | 312.0 |
| 4 | 2022-03-01 | 48.0 | 30.0 | 3.0 | 3.0 | 14.0 | 0.0 | 98.0 |
We will create a new dataframe from our dataframe of losses for ground equipments.
This dataframe will help us see the contribution of different ground equipments (percentage) towards the total of ground equipments lost
col_list = list(df_ground.columns[1:7])
Percentage = []
for i in df_ground.columns[1:7]:
Percentage.append(round((df_ground[i].sum()/df_ground['Total_Losses_Ground'].sum())*100,1))
df_ground_percentage = pd.DataFrame({'Ground Equipment': col_list, '%_of_sum_loss_ground':Percentage})
df_ground_percentage
| Ground Equipment | %_of_sum_loss_ground | |
|---|---|---|
| 0 | Tank | 15.9 |
| 1 | APC | 42.1 |
| 2 | Field_Artillery | 7.5 |
| 3 | MRL | 2.4 |
| 4 | Military_Auto | 30.5 |
| 5 | Fuel_Tank | 1.7 |
fig = px.pie(df_ground_percentage, values='%_of_sum_loss_ground',
names='Ground Equipment', color_discrete_sequence=px.colors.sequential.RdBu)
fig.show()
Create a database for the naval equipments lost.
Date = df.Date
Naval = df.Naval_Ship
df_naval = pd.DataFrame({'Date': Date,'Naval': Naval})
df_naval.head()
| Date | Naval | |
|---|---|---|
| 0 | 2022-02-25 | 2.0 |
| 1 | 2022-02-26 | 0.0 |
| 2 | 2022-02-27 | 0.0 |
| 3 | 2022-02-28 | 0.0 |
| 4 | 2022-03-01 | 0.0 |
#Plot a line plot to analyze ground, aerial and naval equipments lost
fig = px.line(x = df.Date, y = [df_aerial.Total_Losses_Aerial,df_ground.Total_Losses_Ground,df_naval.Naval])
names={'wide_variable_0':'Aerial Equipments','wide_variable_1':'Ground Equipments ','wide_variable_2':'Naval Equipments'}
fig.for_each_trace(lambda x: x.update(name = names[x.name]))
fig.update_layout(hovermode="x unified",yaxis_title = 'Frequency',title = 'Aerial Equipments vs Ground Equipments vs Naval Equipments')
fig.show()
The war signals towards a heavy loss of ground equipments. Whereas the use of naval equipments is almost null. Let us calculate the total for each category.
df_eq = pd.DataFrame()
df_eq['Total_Naval_Losses'] = df_naval.Naval
df_eq['Total_Ground_losses'] = df_ground.Total_Losses_Ground
df_eq['Total_Aerial_Losses'] = df_aerial.Total_Losses_Aerial
eq_plt = df_eq[['Total_Naval_Losses','Total_Ground_losses','Total_Aerial_Losses']].sum().plot.bar(rot = 0,figsize = (10,10),cmap = 'summer')
plt.ylabel('Sum',fontsize = 13)
plt.title('Sum for categories of equipment',fontsize = 15)
plt.grid(axis = 'y',linestyle = '--',alpha = 0.8)
plt.bar_label(eq_plt.containers[0],size = 10)
plt.semilogy()
plt.show()
We can see that the amount of losses of ground equipment is far more than the losses in other categories till now.